Market Basket Analysis

Executive Summary

Introduction

Danielle has asked the team to perform a market basket analysis to help Blackwell’s board of directors better understand the clientele that Electronidex is currently serving and if Electronidex would be an optimal acquisition.

A dataset of transactions has been provided. The dataset contains 9835 transactions and 125 different products over a 30-day period, or about 327 transactions a day. This tells us the retailer is neither large, nor small.

Results, conclusions and recommendations

After the analysis, we conclude Electronidex’s sales are to be categorized in two forms: retail (B2C) and corporate (B2B). Had we had this information previous to our analysis, it would have saved time in the exploration phase. Interesting patterns and item relationships found:

Retail by products:

lhs Var.2 rhs
{Samsung Monitor} => {CYBERPOWER Gamer Desktop}
{CYBERPOWER Gamer Desktop} => {Apple Earpods}
{Apple Earpods} => {CYBERPOWER Gamer Desktop}

Retail by category:

lhs Var.2 rhs
4 {Accessories,Keyboard} => {Desktop}
5 {Accessories,Keyboard} => {Desktop}
6 {Computer Cords,Mouse and Keyboard Combo} => {Laptops}

Corporate by products:

lhs Var.2 rhs
7 {HP Laptop} => {iMac}
8 {ViewSonic Monitor} => {iMac}
9 {iMac} => {HP Laptop}

Corporate by categories:

lhs Var.2 rhs
10 {Laptops} => {Desktop}
11 {Laptops,Monitors} => {Desktop}
12 {Desktop} => {Laptops}

Would Blackwell benefit in selling Electronidex’s items? As a conclusion, we do recommend that Blackwell acquires Electronidex for the following reasons:

  • Product portfolio diversification and customer base expansion: Having previously analyzed the current products that Blackwell supplies to its customers, we believe that new product categories will help reach new customers. Additionally, all clients acquired with Eltronidex will become part of Blackwell’s customer base. The combination of diversifaction and expansion will consequentially increase Blackwell’s revenue.

  • Beneficial Product Association: Since Blackwell currently works with close to 50% of brands shared by Electronidex; Blackwell can take advantage of the discovered associations. This will imply the average amount of Blackwell’s items per transaction will be increased.

  • New potential strategies to adopt by Blackwell’s sales team: The aforementioned results opens new marketing possibilities. Extracting the associations with high confidence and low support (sold less often), it is possible to boost sales for those products (through advertisement and other sales actions). The increase in sales number of the said products will bring in increased revenue since we are confident the new transactions will increase sales volume. Blackwell could create some item packs of those items that are more often purchased at the same transaction, in order to also increase sales volume. Additionally, the rules that we found can be also helpful to provide recommendations to users when they purchase through the e-commerce platorm.

Limitations and observations:

Properties of the dataset:

  1. The iMac is the product most bought, in 20% of all transactions. This high number stands out considering the large variety of products, especially being the iMac a pricey product. If this number is representative of all sales throughout the year, then Electronidex is potentially profitable.
  2. The mean of items bought per transaction is almost 5. Logically, we would say most people in the real world would buy 1 or 2 items per transactions most frequently, in an eletronics store.

Technical Report

Preprocessing

transactions as itemMatrix in sparse format with
 9835 rows (elements/itemsets/transactions) and
 125 columns (items) and a density of 0.03506172 

most frequent items:
                    iMac                HP Laptop CYBERPOWER Gamer Desktop 
                    2519                     1909                     1809 
           Apple Earpods        Apple MacBook Air                  (Other) 
                    1715                     1530                    33622 

element (itemset/transaction) length distribution:
sizes
   0    1    2    3    4    5    6    7    8    9   10   11   12   13   14 
   2 2163 1647 1294 1021  856  646  540  439  353  247  171  119   77   72 
  15   16   17   18   19   20   21   22   23   25   26   27   29   30 
  56   41   26   20   10   10   10    5    3    1    1    3    1    1 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   2.000   3.000   4.383   6.000  30.000 

includes extended item information - examples:
                            labels
1 1TB Portable External Hard Drive
2 2TB Portable External Hard Drive
3                   3-Button Mouse

Feature Engineering

# Counting number of items
nitems <- c()
for (i in 1:nrow(trans_df)) {
    nitems <- c(nitems, sum(trans_df[i, ]))
}
trans_df$nitems <- nitems
trans_df$laptops <- trans_df[, which(colnames(trans_df) == "LG Touchscreen Laptop")] + 
    trans_df[, which(colnames(trans_df) == "Acer Aspire")] + 
trans_df[, which(colnames(trans_df) == "HP Laptop")] + trans_df[, which(colnames(trans_df) == 
    "ASUS Chromebook")] + trans_df[, which(colnames(trans_df) == "Apple Macbook Pro")] + 
    trans_df[, which(colnames(trans_df) == "Apple MacBook Air")] + trans_df[, 
    which(colnames(trans_df) == "Dell Laptop")] + trans_df[, which(colnames(trans_df) == 
    "Eluktronics Pro Gaming Laptop")] + trans_df[, which(colnames(trans_df) == 
    "Alienware AW17R4-7345SLV-PUS 17\" Laptop")] + trans_df[, which(colnames(trans_df) == 
    "HP Notebook Touchscreen Laptop PC")]
trans_df$desktop <- trans_df[, which(colnames(trans_df) == "Lenovo Desktop Computer")] + 
    trans_df[, which(colnames(trans_df) == "iMac")] + trans_df[, which(colnames(trans_df) == 
    "HP Desktop")] + trans_df[, which(colnames(trans_df) == "ASUS Desktop")] + 
    trans_df[, which(colnames(trans_df) == "Dell Desktop")] + trans_df[, which(colnames(trans_df) == 
    "Intel Desktop")] + trans_df[, which(colnames(trans_df) == "Acer Desktop")] + 
    trans_df[, which(colnames(trans_df) == "CYBERPOWER Gamer Desktop")] + trans_df[, 
    which(colnames(trans_df) == "Dell 2 Desktop")]
trans_df$tablet <- trans_df[, which(colnames(trans_df) == "iPad")] + trans_df[, 
    which(colnames(trans_df) == "iPad Pro")] + trans_df[, which(colnames(trans_df) == 
    "Fire HD Tablet")] + trans_df[, which(colnames(trans_df) == "Samsung Galaxy Tab")] + 
    trans_df[, which(colnames(trans_df) == "Kindle")]
trans_df$printer <- trans_df$`Epson Printer` + trans_df$`HP Wireless Printer` + 
    trans_df$`Canon Office Printer` + trans_df$`Brother Printer` + trans_df$`DYMO Label Manker`
trans_df$nmain <- trans_df$printer + trans_df$laptops + trans_df$desktop + trans_df$tablet
trans_df$ncomp <- trans_df$nitems - trans_df$nmain
trans_df$value <- 10 * trans_df$nmain + trans_df$ncomp

Creating rules via apriori algorithm

Rules for products in corporate transactions

lhs Var.2 rhs support confidence lift count average
[1] {HP Laptop} => {iMac} 0.1273350 0.4267662 1.132930 743 2.4153369
[9] {ViewSonic Monitor} => {iMac} 0.0815767 0.4803229 1.275106 476 1.7127774
[2] {iMac} => {HP Laptop} 0.1273350 0.3380346 1.132930 743 1.5000665
[3] {Lenovo Desktop Computer} => {iMac} 0.0990574 0.4355690 1.156299 578 1.0105471
[7] {Dell Desktop} => {iMac} 0.0920308 0.4372964 1.160885 537 0.6431045
[4] {iMac} => {Lenovo Desktop Computer} 0.0990574 0.2629663 1.156299 578 -0.7698555
[10] {iMac} => {ViewSonic Monitor} 0.0815767 0.2165605 1.275106 476 -1.0079418
[8] {iMac} => {Dell Desktop} 0.0920308 0.2443130 1.160885 537 -1.3475267
[5] {CYBERPOWER Gamer Desktop} => {iMac} 0.0956298 0.3798502 1.008383 558 -1.4284940
[6] {iMac} => {CYBERPOWER Gamer Desktop} 0.0956298 0.2538672 1.008383 558 -2.7280144

Rules for categories in corporate transactions

lhs Var.2 rhs support confidence lift count average
[1] {Laptops} => {Desktop} 0.5407027 0.7879620 1.008059 3155 2.0262154
[7] {Laptops,Monitors} => {Desktop} 0.3076264 0.8100181 1.036276 1795 1.7223116
[2] {Desktop} => {Laptops} 0.5407027 0.6917343 1.008059 3155 1.0851044
[3] {Monitors} => {Desktop} 0.4327335 0.7868495 1.006636 2525 0.7676997
[8] {Desktop,Monitors} => {Laptops} 0.3076264 0.7108911 1.035976 1795 0.7320255
[5] {Monitors} => {Laptops} 0.3797772 0.6905578 1.006345 2216 -0.7576991
[9] {Desktop,Laptops} => {Monitors} 0.3076264 0.5689382 1.034514 1795 -0.7578959
[10] {Computer Mice} => {Desktop} 0.2904884 0.7821874 1.000672 1695 -1.2056863
[4] {Desktop} => {Monitors} 0.4327335 0.5536067 1.006636 2525 -1.5134233
[6] {Laptops} => {Monitors} 0.3797772 0.5534466 1.006345 2216 -2.0986520

Rules for products in retailers transactions

lhs Var.2 rhs support confidence lift count average
[7] {Samsung Monitor} => {CYBERPOWER Gamer Desktop} 0.00850 0.2312925 2.7210884 34 2.4223682
[1] {CYBERPOWER Gamer Desktop} => {Apple Earpods} 0.01425 0.1676471 1.3331774 57 2.0435876
[2] {Apple Earpods} => {CYBERPOWER Gamer Desktop} 0.01425 0.1133201 1.3331774 57 1.2268133
[5] {3-Button Mouse} => {Apple Earpods} 0.00925 0.2000000 1.5904573 37 0.7512219
[9] {Apple Macbook Pro} => {Apple Earpods} 0.00800 0.2176871 1.7311099 32 0.6750203
[8] {CYBERPOWER Gamer Desktop} => {Samsung Monitor} 0.00850 0.1000000 2.7210884 34 0.4484622
[6] {Apple Earpods} => {3-Button Mouse} 0.00925 0.0735586 1.5904573 37 -1.1497497
[10] {Apple Earpods} => {Apple Macbook Pro} 0.00800 0.0636183 1.7311099 32 -1.6413135
[4] {Apple Earpods} => {Apple MacBook Air} 0.00975 0.0775348 0.5283461 39 -2.3047997
[3] {Apple MacBook Air} => {Apple Earpods} 0.00975 0.0664395 0.5283461 39 -2.4716106

Rules for categories in retailers transactions

lhs Var.2 rhs support confidence lift count average
[1] {Monitors} => {Desktop} 0.05375 0.3272451 1.1452145 215 4.7116644
[2] {Desktop} => {Monitors} 0.05375 0.1881015 1.1452145 215 2.9661851
[3] {Computer Mice} => {Desktop} 0.03825 0.2875940 1.0064531 153 1.1870590
[7] {Keyboard} => {Desktop} 0.03625 0.2859961 1.0008611 145 0.8514579
[4] {Desktop} => {Computer Mice} 0.03825 0.1338583 1.0064531 153 -0.7414707
[9] {Active Headphones} => {Laptops} 0.03625 0.2449324 0.7825317 145 -0.9944768
[8] {Desktop} => {Keyboard} 0.03625 0.1268591 1.0008611 145 -1.1448269
[5] {Monitors} => {Laptops} 0.03725 0.2267884 0.7245637 149 -1.4346892
[10] {Laptops} => {Active Headphones} 0.03625 0.1158147 0.7825317 145 -2.6141875
[6] {Laptops} => {Monitors} 0.03725 0.1190096 0.7245637 149 -2.7867154

Andreu Oros, Sergi Pallice, Joël Ribera

2019-07-26